Abstract. Wireless sensor networks benefit from communication protocols that reduce
power requirements by avoiding frame collision. Time Division Media Access methods schedule
transmission in slots to avoid collision; however, these methods often lack scalability when
implemented in ad hoc networks subject to node failures and dynamic topology. This paper
reports a distributed algorithm for TDMA slot assignment that is self-stabilizing to transient
faults and dynamic topology change. The expected local convergence time is O(1) for
any size network satisfying a constant bound on the size of a node neighborhood.

1 Introduction 

Collision management and avoidance are fundamental issues in wireless network protocols. Networks
now being imagined for sensors [23] and small devices [3] require energy conservation,
scalability, tolerance to transient faults, and adaptivity to topology change. Time Division Media
Access (TDMA) is a reasonable technique for managing wireless media access; however, the
priorities of scalability and fault tolerance are not emphasized by most previous research. Recent
analysis |S] of radio transmission characteristics typical of sensor networks shows that TDMA may 
not substantially improve bandwidth when compared to randomized collision avoidance protocols, 
however fairness and energy conservation considerations remain important motivations. In appli- 
cations with predictable communication patterns, a sensor may even power off the radio receiver 
during TDMA slots where no messages are expected; such timed approaches to power management 
are typical of the sensor regime. 

Emerging models of ad hoc sensor networks are more constrained than general models of dis- 
tributed systems, especially with respect to computational and communication resources. These 
constraints tend to favor simple algorithms that use limited memory. A few constraints of some 
sensor networks can be helpful: sensors may have access to geographic coordinates and a time 
base (such as GPS provides), and the density of sensors in an area can have a known, fixed upper 
bound. The question we ask in this paper is how systems can distributively obtain a TDMA as- 
signment of slots to nodes, given the assumptions of synchronized clocks and a bounded density 
(where density is interpreted to be a fixed upper bound on the number of immediate neighbors 
in the communication range of any node). In practice, such a limit on the number of neighbors in 
range of a node has been achieved by dynamically attenuating transmission power on radios. Our 
answers to the question of distributively obtaining a TDMA schedule are partial: our results are 
not necessarily optimum, and although the algorithms we present are self-stabilizing, they are not 
optimally designed for all cases of minor disruptions or changes to a stabilized sensor network. 

Before presenting our results, it may be helpful for the reader to consider the relation between
TDMA scheduling and standard problems of graph coloring (since these topics are often found in
textbooks on network algorithms for spatial multiplexing). Algorithmic research on TDMA relates the
problem of timeslot assignment to minimal graph coloring, where the coloring constraint is typically
that of ensuring that no two nodes within distance two have the same color (the constraint
of distance two has a motivation akin to the well-known hidden terminal problem in wireless
networks). This simple reduction of TDMA timeslot assignment neglects some opportunities for time



Fig. 1. Two solutions to distance-two coloring



division: even a solution to minimum coloring does not necessarily give the best result for TDMA
slot assignment. Consider the two colorings shown in Figure 1, which are minimum distance-two
colorings of the same network. We can count, for each node p, the size of the set of colors used
within its distance-two neighborhood (where this set includes p's color); this is illustrated in Figure
2 for the respective colorings of Figure 1. We see that some of the nodes find more colors in their
distance-two neighborhoods in the second coloring of Figure 1. The method of slot allocation
presented later in this paper allocates a larger bandwidth share when the number of colors in
distance-two neighborhoods is smaller. Intuitively, if some node p sees $k < \lambda$ colors in its
distance-two neighborhood, then it should have at least a $1/(k+1)$ share of bandwidth, which is
superior to assigning a $1/(\lambda+1)$ share to each color. Thus the problem of optimum TDMA slot
assignment is, in some sense, harder than optimizing the global number of colors.
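The counting described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the five-node path and both colorings are hypothetical (they are not the network of Figure 1, and the second coloring is not minimum), and the helper names are ours.

```python
from itertools import chain

def distance_two_neighborhood(adj, p):
    """Return the set of nodes within two hops of p, including p itself."""
    one_hop = set(adj[p])
    two_hop = set(chain.from_iterable(adj[q] for q in one_hop))
    return one_hop | two_hop | {p}

def colors_seen(adj, color, p):
    """Number of distinct colors in p's distance-two neighborhood (p's color included)."""
    return len({color[q] for q in distance_two_neighborhood(adj, p)})

def is_distance_two_coloring(adj, color):
    """True if no two distinct nodes within distance two share a color."""
    return all(color[p] != color[q]
               for p in adj
               for q in distance_two_neighborhood(adj, p) if q != p)

# Hypothetical network: the path a - b - c - d - e.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
three_colors = {"a": 0, "b": 1, "c": 2, "d": 0, "e": 1}
four_colors = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 0}

for coloring in (three_colors, four_colors):
    assert is_distance_two_coloring(adj, coloring)
    print([colors_seen(adj, coloring, p) for p in "abcde"])
```

Both colorings are valid distance-two colorings, yet the interior nodes see more colors under the second one, which is the quantity that determines the bandwidth share discussed above.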




Fig. 2. Number of colors used within distance two



Contributions. The main issues for our research are dynamic network configurations, transient
fault tolerance and scalability of TDMA slot assignment algorithms. Our approach to both dynamic
network change and transient fault events is to use the paradigm of self-stabilization, which
ensures the system state converges to a valid TDMA assignment after any transient fault or topology
change event. Our approach to scalability is to propose a randomized slot assignment algorithm
with O(1) expected local convergence time. The basis for our algorithm is, in essence, a
probabilistically fast clustering technique (which could be exploited for other problems of sensor networks).
The expected time for all nodes to have a valid TDMA assignment is not O(1); our view is that
stabilization over the entire network is an unreasonable metric for sensor network applications; we
discuss this further in the paper's conclusion. Our approach guarantees that after stabilization, if
nodes crash, TDMA collision may occur only locally (in the distance-three neighborhood of the
faults).

Related Work. The idea of self-stabilizing TDMA has been developed in [12,13] for a model that
is more restricted than ours (a grid topology where each node knows its location). Algorithms for
allocating TDMA time slots and FDMA frequencies are formulated as vertex coloring problems
in a graph. Let the set of vertex colors be the integers from the range $0..\lambda$. For FDMA the
colors $(f_v, f_w)$ of neighboring vertices $(v, w)$ should satisfy $|f_v - f_w| > 1$ to avoid interference. The
standard notation for this constraint is $L(\ell_1, \ell_2)$: for any pair of vertices at distance $i \in \{1,2\}$,
the colors differ by at least $\ell_i$. The coloring problem for TDMA is: let $L'(\ell_1, \ell_2)$ be the constraint
that for any pair of vertices at distance $i \in \{1,2\}$, the colors differ by at least $\ell_i \bmod (\lambda+1)$.
(This constraint represents the fact that time slots wrap around, unlike frequencies.) The coloring
constraint for TDMA is $L'(1,1)$. Coloring problems with constraints $L(1,0)$, $L(0,1)$, $L(1,1)$, and



L(2, 1) have been well-studied not only for general graphs but for many special types of graphs 
|2I1UI17| : many such problems are NP-complete and although approximation algorithms have been 
proposed, such algorithms are typically not distributed. (The related problem of finding a minimum
dominating set has been shown to have a distributed approximation using constant time,
though it is unclear if the techniques apply to self-stabilizing coloring.) Self-stabilizing algorithms
for $L(1,0)$ have been studied in [5,20,18,19,7], and for $L(1,1)$ in [H]. Our algorithms borrow from
techniques of self-stabilizing coloring and renaming [5,7], which use techniques well known in the
literature of parallel algorithms on PRAM models. To the extent that the sensor network
model is synchronous, some of these techniques can be adapted; however, working out details when
messages collide, and the initial state is unknown, is not an entirely trivial task. This paper is
novel in the sense that it composes self-stabilizing algorithms for renaming and coloring for a base 
model that has only probabilistically correct communication, due to the possibility of collisions at 
the media access layer. Also, our coloring uses a constant number of colors for the $L(1,1)$ problem,
while the previous self-stabilizing solution to this problem uses colors. 
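To make the difference between the two constraint families concrete, the following Python sketch checks a coloring against $L(\ell_1, \ell_2)$ and against its wrap-around variant. It rests on an assumption of ours: we read "differ by at least $\ell_i \bmod (\lambda+1)$" as a cyclic distance, that is, the smaller of $d$ and $(\lambda+1) - d$ for $d = |f_v - f_w|$. The function names and the three-node example are hypothetical.

```python
def pairs_at_distance(adj, i):
    """Yield unordered pairs (u, v) whose hop distance is exactly i, for i in {1, 2}."""
    for u in adj:
        near = set(adj[u])
        if i == 2:
            # two-hop nodes that are neither u itself nor direct neighbors of u
            near = {w for q in adj[u] for w in adj[q]} - set(adj[u]) - {u}
        for v in near:
            if u < v:
                yield u, v

def satisfies(adj, color, bounds, modulus=None):
    """Check L(l1, l2); with modulus = lambda + 1 given, check the cyclic variant L'."""
    for i, bound in zip((1, 2), bounds):
        for u, v in pairs_at_distance(adj, i):
            d = abs(color[u] - color[v])
            if modulus is not None:
                d = min(d, modulus - d)  # assumed cyclic reading of the mod constraint
            if d < bound:
                return False
    return True

# Hypothetical example: the path 0 - 1 - 2 with colors drawn from 0..3 (lambda = 3).
adj = {0: [1], 1: [0, 2], 2: [1]}
color = {0: 0, 1: 3, 2: 1}
print(satisfies(adj, color, (2, 1)))             # frequency-style L(2, 1)
print(satisfies(adj, color, (2, 1), modulus=4))  # cyclic L'(2, 1)
print(satisfies(adj, color, (1, 1), modulus=4))  # TDMA constraint L'(1, 1)
```

Under this reading, colors 0 and 3 are far apart as frequencies but adjacent as time slots, so the same coloring can satisfy $L(2,1)$ and violate $L'(2,1)$; for $L'(1,1)$ the constraint reduces to distinct colors within distance two.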



2 Wireless Network, Program Notation 

The system comprises a set V of nodes in an ad hoc wireless network, and each node
has a unique identifier. Communication between nodes uses a low-power radio. Each node p can
communicate with a subset $N_p \subseteq V$ of nodes determined by the range of the radio signal; $N_p$ is
called the neighborhood of node p. In the wireless model, transmission is omnidirectional: each
message sent by p is effectively broadcast to all nodes in $N_p$. We also assume that communication
capability is bidirectional: $q \in N_p$ iff $p \in N_q$. Define $N_p^1 = N_p$ and, for $i > 1$,
$N_p^i = N_p^{i-1} \cup \{\, r \mid (\exists q : q \in N_p^{i-1} : r \in N_q) \,\}$ (call $N_p^i$ the distance-$i$ neighborhood of p). Distribution of nodes
is sparse: there is some known constant $\delta$ such that for any node p, $|N_p| \le \delta$. (Sensor networks can
control density by powering off nodes in areas that are too dense, which is one aim of topology
control algorithms.)
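The recursive definition of the distance-$i$ neighborhood translates directly into a short Python sketch. The adjacency-dictionary representation and the example path are assumptions made only for illustration.

```python
def neighborhood(adj, p, i):
    """Distance-i neighborhood of p, following the recursive definition:
    N^1 = N_p, and N^i = N^(i-1) together with every neighbor of a node in N^(i-1)."""
    result = set(adj[p])
    for _ in range(i - 1):
        result = result | {r for q in result for r in adj[q]}
    return result

# Hypothetical network: the path 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
for i in (1, 2, 3):
    print(i, sorted(neighborhood(adj, 0, i)))
```

Note that, taken literally, the definition places p itself in its distance-$i$ neighborhood for $i > 1$ whenever p has at least one neighbor, because links are bidirectional.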

Each node has fine-grained, real-time clock hardware, and all node clocks are synchronized to a 
common, global time. Each node uses the same radio frequency (one frequency is shared spatially 
by all nodes in the network) and media access is managed by CSMA/CA: if node p has a message 
ready to transmit, but is receiving some signal, then p does not begin transmission until it detects 
the absence of signal; and before p transmits a message, it waits for some random period (as 
implemented, for instance, in [23]). We assume that the implementation of CSMA/CA satisfies
the following: there exists a constant r > 0 such that the probability of a frame transmission
without collision is at least r (this corresponds to typical assumptions for multiaccess channels pQ; 
the independence of r for different frame transmissions indicates our assumption of an underlying 
memoryless probability distribution in a Markov model).

Notation. We describe algorithms using the notation of guarded assignment statements: $G \to S$
represents a guarded assignment, where G is a predicate of the local variables of a node, and S is an
assignment to local variables of the node. If predicate G (called the guard) holds, then assignment
S is executed, otherwise S is skipped. Some guards can be event predicates that hold upon the
event of receiving a message: we assume that all such guarded assignments execute atomically
when a message is received. At any system state where a given guard G holds, we say that G is
enabled at that state. The [] operator is the nondeterministic composition of guarded assignments;
$([]\, q : q \in M_p : G_q \to S_q)$ is a closed-form expression of
$G_{q_1} \to S_{q_1} \;[]\; G_{q_2} \to S_{q_2} \;[]\; \cdots \;[]\; G_{q_k} \to S_{q_k}$, where $M_p = \{q_1, q_2, \ldots, q_k\}$.

Execution Semantics. The life of computing at every node consists of the infinite repetition 
of finding a guard and executing its corresponding assignment or skipping the assignment if the 
guard is false. Generally, we suppose that when a node executes its program, all statements with
true guards are executed in some constant time (done, for example, in round-robin order).




Fig. 3. Shared variable propagation

Shared Variable Propagation. A certain subset of the variables at any node are designated as
shared variables. Nodes periodically transmit the values of their shared variables, based on a
timed discipline. A simple protocol in our notation for periodic retransmission would be
$true \to transmit(var_p)$ for each shared variable of p, where generally, we suppose that when a node executes
its program, all statements with true guards are executed in some constant time (done, for example, 
in round-robin order). (One local node variable we do not explicitly use is the clock, which advances 
continuously in real time; guards and assignments could refer to the clock, but we prefer to 
discipline the use of time as follows.) 

Beyond periodic retransmission, assignment to a shared variable causes peremptory transmission:
if a statement $G \to S$ assigns to a shared variable, then we present the statement without any
reference to the clock and we suppose that there is a transformation of the statement into a
computation that slows execution so that it does not exceed some desired rate, and also provides
randomization to avoid collision in messages that carry shared variable values. This could be
implemented using a timer associated with $G \to S$. One technique for implementing $G \to S$ could
be the following procedure:

Suppose the previous invocation of the procedure for $G \to S$ finished at time t; the
next evaluation of $G \to S$ occurs at time $t + \beta$, where $\beta$ is a random delay inserted by
the CSMA/CA implementation. After executing S (or skipping the assignment if G is
false), the node transmits a message containing all shared variable values. This message
transmission may be postponed if the node is currently receiving a message. Finally, after
transmitting the message, the node waits for an additional k time units, where k is a
given constant. Thus, in brief, $G \to S$ is forever evaluated by waiting for a random period,
atomically evaluating $G \to S$, transmitting shared variable(s), and waiting for a constant
time period k. Figure 3 illustrates the cycle of shared variable propagation for one node.
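The cycle just described (random wait, atomic evaluation, transmission, constant wait) can be sketched as a small simulation. Everything beyond that four-step structure is an assumption of ours: the delay is drawn uniformly from a bounded interval, transmission is instantaneous and never postponed, and the callback names are hypothetical.

```python
import random

def propagation_cycle(guard, assign, transmit, k, max_delay, rounds, rng):
    """Simulate the evaluate-and-transmit cycle for one guarded assignment G -> S.

    Returns the simulated times at which shared variables were transmitted.
    """
    t = 0.0
    times = []
    for _ in range(rounds):
        t += rng.uniform(0.0, max_delay)  # random delay beta (assumed uniform)
        if guard():                       # atomically evaluate G -> S
            assign()
        transmit(t)                       # send all shared variable values
        times.append(t)
        t += k                            # constant wait before the next cycle
    return times

# Hypothetical usage: a shared counter that is incremented until it reaches 3.
state = {"x": 0}
sent = []
times = propagation_cycle(
    guard=lambda: state["x"] < 3,
    assign=lambda: state.update(x=state["x"] + 1),
    transmit=lambda t: sent.append((t, state["x"])),
    k=2.0, max_delay=1.0, rounds=5, rng=random.Random(1))
print([value for _, value in sent])
```

In this sketch consecutive transmissions are separated by k plus a random delay, so the rate of assignments to a shared variable is bounded as the discipline requires.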

To reconcile our earlier assumption of immediate, atomic processing of messages with the discipline 
of shared variable assignment, no guarded assignment execution should change a shared variable 
in the atomic processing of receiving a message. All the programs in this paper have this property, 
that receipt of a message atomically changes only nonshared variables. 

Given the discipline of repeated transmission of shared variables, each node can have a cached 
copy of the value of a shared variable for any neighbor. This cached copy is updated atomically 
upon receipt of a message carrying a new value for the shared variable. 

Model Construction. Our goal is to provide an implementation of a general purpose, collision-free
communication service. This service can be regarded as a transformation of the given model of
Section 2 into a model without collisions. This service simplifies application programming and
can reduce energy requirements for communication (messages do not need to be resent due to
collisions). Let T denote the task of transforming the model of Section 2 into a collision-free
model.

We seek a solution to T that is self-stabilizing in the sense that, after some transient failure or 
reconfiguration, node states may not be consistent with the requirements of collision-free com- 
munication and collisions can occur; eventually the transformer corrects node states to result in 



collision-free communication. Our first design decision is to suppose that the implementation we 
seek is not itself free of collisions. That is, even though our goal is to provide applications a 
collision-free service, our implementation may introduce overhead messages susceptible to colli- 
sions. Initially, in the development of algorithms, we accept collisions and resends of these messages, 
which are internal to T and not visible to the application. 

To solve T it suffices to assign each node a color and use node colors as the schedule for a TDMA
approach to collision-free communication. Even before colors are assigned, we use a schedule
that partitions radio time into two parts: one part is for TDMA scheduling of application messages 
and the other part is reserved for the messages of the algorithm that assigns colors and time slots 
to nodes. The following diagram illustrates such a schedule, in which each TDMA part has five 
slots. Each overhead part is, in fact, a fixed-length slot in the TDMA schedule. 

The programming model, including the technique for sharing variables described in Section 2,
refers to message and computation activity in the overhead parts. It should be understood that
the timing of shared variable propagation illustrated in Figure 3 may span overhead slots: the
computation by the solution to T operates in the concatenation of all the overhead slots. Whereas 
CSMA/CA is used to manage collisions in the overhead slots, the remaining TDMA slots do not 
use random delay. During initialization or after a dynamic topology change, frames may collide in 
the TDMA slots, but after the slot assignment algorithm self-stabilizes, collisions do not occur in 
the TDMA slots. 
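As an illustration of such a schedule, the following Python sketch maps a time instant to its slot. The layout is an assumption: we take each frame to consist of the five TDMA slots followed by one overhead slot of the same fixed length, and the function name and integer time units are hypothetical.

```python
def slot_at(t, slot_len, tdma_slots=5):
    """Classify time t as a TDMA slot index or as the overhead slot.

    Assumed layout: each frame holds tdma_slots application slots followed by
    one overhead slot, all of length slot_len.
    """
    frame_len = (tdma_slots + 1) * slot_len
    index = int((t % frame_len) // slot_len)
    if index == tdma_slots:
        return ("overhead", None)
    return ("tdma", index)

# Hypothetical usage with slots of 10 time units.
for t in (0, 49, 50, 60):
    print(t, slot_at(t, 10))
```

Under this layout a node that has been assigned TDMA slot index c would transmit application messages only when the classification is ("tdma", c), and would run the slot assignment algorithm, with CSMA/CA, only in the overhead slot.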

With respect to any given node v, a solution T is locally stabilizing with convergence time t if,
for any initial system state, after at most t time units, every subsequent system state satisfies the 
property that any transmission by v during its assigned slot(s) is free from collision. Solution T is 
globally stabilizing with convergence time t if, for every initial state, after at most t time units, every 
subsequent system state has the property that all transmissions during assigned slots are free from 
collision. For randomized algorithms, these definitions are modified to specify expected convergence 
times (all stabilizing randomized algorithms we consider are probabilistically convergent in the 
Las Vegas sense). When the qualification (local or global) is omitted, convergence times for local 
stabilization are intended for the presented algorithms. 

Several primitive services that are not part of the initial model can simplify the design and
expression of T's implementation. All of these services need to be self-stabilizing. Briefly put, our plan
is to develop a sequence of algorithms that enable TDMA implementation. These algorithms are:
neighborhood-unique naming, maximal independent set, minimal coloring, and the assignment of
time slots from colors. In addition, we rely on neighborhood services that update cached copies of
shared variables.

Neighborhood Identification. We do not assume that a node p has built-in knowledge of its
neighborhood $N_p$ or its distance-three neighborhood $N_p^3$. This is because the type of network under
consideration is ad hoc, and the topology is dynamic. Therefore some algorithm is needed so that a
node can refer to its neighbors. We describe first how a node p can learn of $N_p^2$, since the technique
can be extended to learn $N_p^3$ in a straightforward way.

Each node p can represent $N_p^i$ for $i \in 1..3$ by a list of identifiers learned from messages received
at p. However, because we do not make assumptions about the initial state of any node, such list
representations can initially have arbitrary data. Let L be a data type for a list of up to $\delta$ items
of the form a : A, where a is an identifier and A is a set of up to $\delta$ identifiers. Let $sL_p$ be a shared
variable of type L. Let message type mN with field of type L be the form of messages transmitted
for $sL_p$. Let $L_p$ be a private variable of a type that is an augmentation of L: it associates a real
number with each item: age(a : A) is a positive real value attached to the item.

Function update($L_p$, a : A) changes $L_p$ to have new item information: if $L_p$ already has some item
whose first component is a, it is removed and replaced with a : A (which then has age zero); if $L_p$
has fewer than $\delta$ items and no item with a as first component, then a : A is added to $L_p$; if $L_p$
has already $\delta$ items and no item with a as first component, then a : A replaces some item with
maximal age.

Let maxAge be some constant designed to be an upper limit on the possible age of items in $L_p$.
Function neighbors($L_p$) returns the set

$\{\, q \mid q \ne p \;\land\; (\exists\, (a : A) : (a : A) \in L_p : a = q) \,\}$

Given these variable definitions and functions, we present the algorithm for neighborhood identi- 
fication. 

N0: receive $mN(a : A) \;\to\; update(L_p,\; a : A \setminus \{p\})$

N1: $([]\, (a : A) \in L_p : age(a : A) > maxAge \;\to\; L_p := L_p \setminus (a : A))$

N2: $true \;\to\; sL_p := (p : neighbors(L_p))$

We cannot directly prove that this algorithm stabilizes because the CSMA/CA model admits the
possibility that a frame, even if repeatedly sent, can suffer arbitrarily many collisions. Therefore
the age associated with any element of $L_p$ can exceed maxAge, and the element will be removed
from $L_p$. The constant maxAge should be tuned to safely remove old or invalid neighbor data, yet
to retain current neighbor information by receiving new mN messages before age expiration. This
is an implementation issue beyond the scope of this paper: our abstraction of the behavior of
the communication layer is the assumption that, eventually for any node, the guard of N1 remains
false for any $(a : A) \in L_p$ for which $a \in N_p$.
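A minimal Python sketch of the list $L_p$ with its update, aging, and neighbors operations may help fix ideas. It is not an implementation of the radio layer: message receipt and the passage of time are driven by explicit method calls, ties among items of maximal age are broken arbitrarily, and the class and method names are our own.

```python
class NeighborList:
    """Sketch of the private list L_p of node `owner`, holding at most delta aged items."""

    def __init__(self, owner, delta, max_age):
        self.owner, self.delta, self.max_age = owner, delta, max_age
        self.items = {}  # identifier a -> (set A, age of the item)

    def update(self, a, A):
        """Body of N0: install item a : A \\ {p} with age zero, evicting an oldest item if full."""
        A = set(A) - {self.owner}
        if a not in self.items and len(self.items) >= self.delta:
            oldest = max(self.items, key=lambda x: self.items[x][1])
            del self.items[oldest]
        self.items[a] = (A, 0.0)

    def tick(self, dt):
        """Advance every age by dt, then apply N1: drop items older than max_age."""
        self.items = {a: (A, age + dt) for a, (A, age) in self.items.items()
                      if age + dt <= self.max_age}

    def neighbors(self):
        """The set of identifiers q != p that appear as the first component of an item."""
        return {a for a in self.items if a != self.owner}

    def shared(self):
        """Value assigned to sL_p by N2: the item p : neighbors(L_p)."""
        return (self.owner, self.neighbors())

# Hypothetical usage at node "p" with delta = 2 and maxAge = 5.
table = NeighborList("p", delta=2, max_age=5.0)
table.update("q", {"p", "r"})
table.tick(3.0)
table.update("s", {"q"})
table.update("t", set())   # list is full, so the oldest item (q) is evicted
print(table.shared())
```

The sketch makes the tuning issue visible: an item survives only if a fresh mN message resets its age before max_age elapses.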

Proposition 1. Eventually, for every node p, $sL_p = N_p$ holds continuously.

Proof. Eventually any element $(a : A) \in L_p$ such that $a \notin N_p$ is removed. Therefore, eventually
every node p can have only its neighbors listed in $sL_p$. Similarly, with probability 1, each node p
eventually receives an mN message from each neighbor, so $sL_p$ contains exactly the neighbors of
p.

By a similar argument, eventually each node p correctly has knowledge of $N_p^2$ as well as $N_p$. The
same technique can enable each node to eventually have knowledge of $N_p^3$ (it is likely that $N_p^3$ is
not necessary; we discuss this issue in Sections 3 and 8). In all subsequent sections, we use $N_p^i$ for
$i \in 1..3$ as constants in programs with the understanding that such neighborhood identification is
actually obtained by the stabilizing protocol described above.

Building upon $L_p$, cached values of the shared variables of nodes in $N_p^i$, for $i \in 1..3$, can be
maintained at p; erroneous cache values not associated with any node can be discarded by the
aging technique. We use the following notation in the rest of the paper: for node p and some shared
variable $var_q$ of node $q \in N_p^i$, let $\Box var_q$ refer to the cached copy of $var_q$ at p. The method of
propagating cached copies of shared variables is generally self-stabilizing only for shared variables 
that do not change value. With the exception of one algorithm presented in Section 13 all of 
our algorithms use cached shared variables in this way: eventually, the shared variables become 
constant, implying that eventually all cached copies of them will be coherent. 

For algorithms developed in subsequent sections, we require a stronger property than eventual
propagation of shared variable values to their caches. We require that with some constant
probability, any shared variable will be propagated to its cached locations within constant time. This
is tantamount to requiring that with constant probability, a node will transmit within constant
time and the transmission will not collide with any other frame. Section 2 states our assumption
on wireless transmission, based on the constant r for collision-free transmission. The discipline
of shared variable propagation illustrated in Figure 3 spaces shared-variable updates (or skipping
updates when there is nothing to change) by $k + \beta$, where $\beta$ is a random variable. Our requirement
on the behavior of transmitting shared variable values thus also implies that for any time t between
transmission events, there is a constant probability that the next transmission event will occur by
t + a for some constant a. Notice that the joint probability of waiting at most time a, and then
sending without collision, is bounded below by a constant. It follows that the expected number
of attempts to propagate a shared variable value before successfully writing to all its caches is
O(1). (In fact, it would not change our analysis if random variable $\beta$ is truncated by aborting
attempted transmissions that exceed some constant timeout threshold.) We henceforth assume
that the expected time for shared variable propagation is constant.

Problem Definition. Let T denote the task of assigning TDMA slots so that each node has some 
assigned slot(s) for transmission, and this transmission is guaranteed to be collision-free. We seek 
a solution to T that is distributed and self-stabilizing in the sense that, after some transient 
failure or reconfiguration, node states may not be consistent with the requirements of collision- 
free communication and collisions can occur; eventually the algorithm corrects node states to result 
in collision-free communication. 

3 Neighborhood Unique Naming 

An algorithm providing neighborhood-unique naming gives each node a name distinct from any
of its $N^3$-neighbors. This may seem odd considering that we already assume that nodes have
unique identifiers, but when we try to use the identifiers for certain applications such as coloring,
the potentially large namespace of identifiers can cause scalability problems. Therefore it can be
useful to give nodes smaller names, from a constant space of names, in a way that ensures names
are locally unique.

The problem of neighborhood unique naming can be considered as an $N^3$-coloring algorithm and
quickly suggests a solution to T. Since neighborhood unique naming provides a solution to the
problem of $L(1,1)$ coloring, it provides a schedule for TDMA. This solution would be especially
wasteful if the space of unique identifiers is larger than $|V|$. It turns out that having unique
identifiers within a neighborhood can be exploited by other algorithms to obtain a minimal
$N^2$-coloring, so we present a simple randomized algorithm for $N^3$-naming.

Our neighborhood unique naming algorithm is roughly based on the randomized technique
described in [H], and introduces some new features. Define $\Delta = \lceil \delta^t \rceil$ for some $t > 3$; the choice of t to
fix constant $\Delta$ has two competing motivations discussed at the end of this section. We call $\Delta$ the
namespace. Let shared variable $Id_p$ have domain $0..\Delta$; variable $Id_p$ is the name of node p. Another
variable is used to collect the names of neighboring nodes: $Cids_p = \{\, \Box Id_q \mid q \in N_p^3 \setminus \{p\} \,\}$.
Let random(S) choose with uniform probability some element of set S. Node p uses the following
function to compute $Id_p$:

$$newId(Id_p) = \begin{cases} Id_p & \text{if } Id_p \notin Cids_p \\ random(\Delta \setminus Cids_p) & \text{otherwise} \end{cases}$$

The algorithm for unique naming is the following.

N3: $true \;\to\; Id_p := newId(Id_p)$

Define Uniq(p) to be the predicate that holds iff (i) no name mentioned in $Cids_p$ is equal to $Id_p$,
(ii) for each $q \in N_p^3$, $q \ne p$, $Id_q \ne Id_p$, (iii) for each $q \in N_p^3$, one name in $Cids_q$ equals $Id_p$, (iv) for
each $q \in N_p^3$, $q \ne p$, the equality $\Box Id_p = Id_p$ holds at node q, and (v) no cache update message en
route to p conveys a name that would update $Cids_p$ to have a name equal to $Id_p$. Predicate Uniq(p)
states that p's name is known to all nodes in $N_p^3$ and does not conflict with any name of a node q
within $N_p^3$, nor is there a cached name liable to update $Cids_p$ that conflicts with p's name. A key
property of the algorithm is the following: Uniq(p) is a stable property of the execution. This is
because after Uniq(p) holds, any node q in $N_p^3$ will not assign $Id_q$ to equal p's name, because N3
avoids names listed in the cache of distance-three neighborhood names (this stability property
is not present in the randomized algorithm). The property $(\forall r : r \in R : Uniq(r))$ is similarly
stable for any subset R of nodes. In words, once a name becomes established as unique for all
the neighborhoods it belongs to, it is stable. Therefore we can reason about a Markov model of
executions by showing that the probability of a sequence of steps moving, from one stable set of
ids to a larger stable set, is positive.
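The function newId and statement N3 can be sketched in Python as follows. The sketch assumes that the cached names $Cids_p$ are available as a set and that the namespace is large enough to leave at least one unused name, which is the purpose of choosing $\Delta = \lceil \delta^t \rceil$ with $t > 3$; the parameter names are ours.

```python
import random

def new_id(id_p, cids_p, namespace_max, rng=random):
    """newId: keep id_p unless it conflicts with a cached name in cids_p; otherwise
    choose uniformly among the names in 0..namespace_max not present in cids_p."""
    if id_p not in cids_p:
        return id_p
    # Assumes at least one name is free, i.e. namespace_max + 1 > len(cids_p).
    free = [name for name in range(namespace_max + 1) if name not in cids_p]
    return rng.choice(free)

# Hypothetical usage: statement N3 is the repeated assignment id_p := new_id(id_p, ...).
print(new_id(3, {1, 2}, 10))                    # no conflict, the name is kept
print(new_id(3, {3, 4}, 10, random.Random(0)))  # conflict, a fresh name is drawn
```

Because a node keeps its name whenever no cached name conflicts with it, a name that satisfies Uniq(p) is never changed by later executions of N3, which is the stability property used in the analysis below.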

Lemma 1. Starting from any state, there is a constant, positive probability that Uniq(p) holds
within constant time.

Proof. The proof has three cases for p: (a) Uniq(p) holds initially; (b) $\neg$Uniq(p) holds, but p cannot
detect this locally (this means that there exists some neighbor q of p such that $\Box Id_p \ne Id_p$ at q); or
(c) p detects $\neg$Uniq(p) and chooses a new name. Case (a) trivially verifies the lemma. For case (b),
it could happen that Uniq(p) is established only by actions of nodes other than p within constant
time, and the lemma holds; otherwise we rely on the periodic mechanism of cache propagation
and the lower bound r on the probability of collision-free transmission to reduce (b) to (c) with
some constant probability within constant time. For case (c) we require a joint event, which is
the following sequence: p chooses a name different from any in $N_p^3$ and their caches (or messages
en route), then p transmits the new name without collision to $N_p$, each node $q \in N_p$ transmits
the cache of p's name without collision, and then each node in $N_p^2 \setminus N_p$ transmits the cache of
p's name without collision. Fix some constant time $\Phi$ for this sequence of events; time $\Phi$ could
be $(\delta + 1) \cdot \mu$, where $\mu$ is the average time for a cached value to be transmitted without collision.
The joint probability x for this scenario is the product of probabilities for each event, with the
constraint that the event is transmission without collision within the desired time constraint $\mu$.
This sequence is not enough, however, to fully estimate the probability for case (c), because it could
be that nodes of $N_p^3$ concurrently assign new identifiers, perhaps equal to p's name. Therefore we
multiply by x the product of probabilities that each invocation of newId by $q \in N_p^3$ during the
time period $\Phi$ does not return a name equal to p's name. Notice that the number of times that
any $q \in N_p^3$ can invoke N3 is bounded by $\Phi/k$, because assignment to shared variables follows
the discipline of at least k delay. Thus the entire number of invocations of newId in the $\Phi$-length
time period is bounded by a constant. Therefore the overall joint probability is estimated by the
product of x and a fixed number of constant probabilities; the joint probability for this scenario is
thus bounded by a product of constant probabilities (dependent on $\Delta$, $\delta$, r, and k). Because this
joint probability is bounded below by a nonzero constant, the expected number of trials to reach
a successful result is constant.

Corollary 1. The algorithm self-stabilizes with probability 1 and has constant expected local con- 
vergence time. 

Proof. The Markov chain for the algorithm has a trapping state for any p such that Uniq(p) holds. 
The stability of Uniq(p) for each p separately means that we can reason about self-stabilization for 
each node independently. The previous lemma implies that each node converges to Uniq(p) with 
probability 1, and also implies the constant overall time bound. 

Using the names assigned by N3 is a solution to $L(1,1)$ coloring; however, using $\Delta$ colors is not the
basis for an efficient TDMA schedule. The naming obtained by the algorithm does have a useful
property. Let P be a path of t distinct nodes, that is, $P = p_1, p_2, \ldots, p_t$. Define predicate $Up(P)$
to hold if $id_{p_i} < id_{p_j}$ for each $i < j$. In words, $Up(P)$ holds if the names along the path P increase.



Lemma 2. Every path P satisfying $Up(P)$ has fewer than $\Delta + 1$ nodes.

Proof. If a path P satisfying $Up(P)$ has $\Delta + 1$ nodes, then some name appears at least twice in
the path. The ordering on names is transitive, which implies that some name a of a node in P
satisfies a < a, and this contradicts the total order on names.

This lemma shows that the simple coloring algorithm gives us a property that node identifiers 
do not have: the path length of any increasing sequence of names is bounded by a constant. 
Henceforth, we suppose that node identifiers have this property, that is, we treat iV* as if the node 
identifiers are drawn from the namespace of size A. 

There are two competing motivations for tuning the parameter $t$ in $\Delta \approx \delta^t$. First,
$t$ should be large enough to ensure that the choice made by newId is unique with high probability.
In the worst case, $|N_p^3| = \delta^3 + 1$, and each node's cache can contain $\delta^3$ names, so
choosing $t \approx 6$ could be satisfactory. Generally, larger values for $t$ decrease the expected
convergence time of the neighborhood unique naming algorithm. On the other hand, smaller values of $t$
will reduce the constant $\Delta$, which will reduce the convergence time for algorithms described in
subsequent sections.
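
The trade-off can be made concrete with a rough calculation. The sketch below assumes that newId
draws a name uniformly at random from a namespace of size $\delta^t$ and that at most $\delta^3$
names are already cached; the uniform choice and the function name `collision_bound` are assumptions
for illustration, not statements about N3.

```python
from fractions import Fraction

def collision_bound(delta, t):
    """Upper bound on the chance that a uniformly chosen name is already cached.

    Assumes a namespace of delta**t names of which at most delta**3 are taken.
    """
    namespace = delta ** t
    cached = delta ** 3
    return min(Fraction(1), Fraction(cached, namespace))

# Hypothetical neighborhood bound delta = 8: larger t lowers the collision
# bound but enlarges the namespace, and with it the constant Delta.
for t in (4, 5, 6):
    print(t, 8 ** t, collision_bound(8, t))
```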



4 Leaders via Maximal Independent Set 

Simple distance two coloring algorithms may use a number of colors that is wastefully large. Our
objective is to find an algorithm that uses a reasonable number of colors and completes, with high 
probability, in constant time. We observe that an assignment to satisfy distance two coloring can 
be done in constant time given a set of neighborhood leader nodes distributed in the network. The 
leaders dictate coloring for nearby nodes. The coloring enabled by this method is minimal (not 
minimum, which is an NP-hard problem). An algorithm selecting a maximal independent set is 
our basis for selecting the leader nodes. 

Let each node $p$ have a boolean shared variable $\ell_p$. In an initial state, the value of $\ell_p$
is arbitrary. A legitimate state for the algorithm satisfies $(\forall p : p \in V : C_p)$, where

\[
  C_p \;\equiv\; (\ell_p \Rightarrow (\forall q : q \in N_p : \neg\ell_q))
  \;\wedge\; (\neg\ell_p \Rightarrow (\exists q : q \in N_p : \ell_q))
\]

Thus the algorithm should elect one leader (identified by the $\ell$-variable) for each neighborhood.
As in previous sections, $\boxtimes\ell_p$ denotes the cached copy of the shared variable $\ell_p$.

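R1: $(\forall q : q \in N_p : q > p) \;\rightarrow\; \ell_p := \mathit{true}$

R2: $([\,]\, q : q \in N_p : \boxtimes\ell_q \wedge q < p \;\rightarrow\; \ell_p := \mathit{false})$

R3: $(\exists q : q \in N_p : q < p) \wedge (\forall q : q \in N_p : q > p \vee \neg\boxtimes\ell_q) \;\rightarrow\; \ell_p := \mathit{true}$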

Although the algorithm does not use randomization, its convergence technically remains proba- 
bilistic because our underlying model of communication uses CSMA/CA based on random delay. 
The algorithm's progress is therefore guaranteed with probability 1 rather than by deterministic 
means. 
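
The behavior of R1-R3 can be explored with a small simulation. This is a sketch under simplifying
assumptions, not the paper's execution model: rounds are synchronous, every cache is refreshed
between rounds (standing in for timely CSMA/CA propagation), node identifiers are globally unique,
and the random graph and the helper names are hypothetical.

```python
import random

def mis_round(adj, leader):
    """One synchronous round: every node evaluates R1-R3 on cached values."""
    cache = dict(leader)            # caches hold the previous round's values
    new = {}
    for p in adj:
        lower = [q for q in adj[p] if q < p]
        if not lower:                           # R1: every neighbor has a larger name
            new[p] = True
        elif any(cache[q] for q in lower):      # R2: a smaller-named neighbor leads
            new[p] = False
        else:                                   # R3: no smaller-named neighbor leads
            new[p] = True
    return new

def stabilize(adj, leader):
    """Repeat rounds until a fixed point; return it and the number of changing rounds."""
    rounds = 0
    while True:
        nxt = mis_round(adj, leader)
        if nxt == leader:
            return leader, rounds
        leader, rounds = nxt, rounds + 1

rng = random.Random(1)
n = 30
adj = {p: set() for p in range(n)}
for p in range(n):
    for q in range(p + 1, n):
        if rng.random() < 0.15:
            adj[p].add(q)
            adj[q].add(p)

initial = {p: rng.random() < 0.5 for p in range(n)}    # arbitrary initial state
leader, rounds = stabilize(adj, initial)
print(sorted(p for p in adj if leader[p]), rounds)
```

In this setting each node's value depends only on smaller-named neighbors, so the values settle in
order of increasing name, mirroring the induction used in the proof of Lemma 3.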

Lemma 3. With probability 1 the algorithm R1-R3 converges to a solution of maximal independent
set; the convergence time is $O(1)$ if each timed variable propagation completes in $O(1)$ time.



Proof. We prove by induction on the namespace that each node $p$ stabilizes its value of $\ell_p$
within $O(\Delta)$ time. For the base case, consider the set $S$ of nodes with locally minimum names,
that is, $(\forall p, q : p \in S \wedge q \in N_p : p < q)$. Any node $p \in S$ stabilizes in $O(1)$
time to $\ell_p = \mathit{true}$. The claim follows from the fact that guards of R2 and R3 are false,
whereas the guard of R1 is permanently true. Therefore for the induction step, we can ignore R1, as
it is dealt with in the base case.

To complete the induction, suppose that each node $r$ has stabilized the value of $\ell_r$, where
$r \le k$. Now consider the situation of a node $p$ with name $k + 1$ (if there is no such node, the
induction is trivially satisfied). As far as the guards of R2 and R3 are concerned, the value of
$\ell_q$ is only relevant for a neighbor $q$ with $q < p$, and for any such neighbor, $\ell_q$ is
stable by hypothesis. Since guards of R2 and R3 are exclusive, it follows that $p$ stabilizes $\ell_p$
and $\boxtimes\ell_p$ is propagated within $O(1)$ time.

Finally, we observe that in any fixed point of the algorithm R1-R3, no two neighbors are leaders
(else R2 would be enabled for one of them), nor does any nonleader find no leader in its neighborhood
(else R1 or R3 would be enabled). This establishes that $C_p$ holds at a fixed point for every
$p \in V$. The induction terminates with at most $\Delta$ steps, the size of the namespace, and
because $\Delta$ is a constant, the convergence time is $O(1)$ for this algorithm.

5 Leader Assigned Coloring 

Our method of distance-two coloring is simple: colors are assigned by the leader nodes given by 
maximal independent set output. The following variables are introduced for each node p: 

$\mathit{color}_p$ is a number representing the color for node $p$.

$\mathit{min}\ell_p$ is meaningful only for $p$ such that $\neg\ell_p$ holds: it is intended to satisfy

\[ \mathit{min}\ell_p = \min \{\, q \mid q \in N_p \wedge \boxtimes\ell_q \,\} \]

In words, $\mathit{min}\ell_p$ is the smallest id of any neighbor that is a leader. Due to the
uniqueness of names in $N_p^3$, the value $\mathit{min}\ell_p$ stabilizes to a unique node.

$\mathit{spectrum}_p$ is a set of pairs $(c, r)$ where $c$ is a color and $r$ is an id. Pertaining only
to nonleader nodes, $\mathit{spectrum}_p$ should contain $(\mathit{color}_p, \mathit{min}\ell_p)$ and
$(\boxtimes\mathit{color}_q, \boxtimes\mathit{min}\ell_q)$ for each $q \in N_p$.

$\mathit{setcol}_p$ is meaningful only for $p$ such that $\ell_p$ holds. It is an array of colors
indexed by identifier: $\mathit{setcol}_p[q]$ is $p$'s preferred color for $q \in N_p$. We consider
$\mathit{color}_p$ and $\mathit{setcol}_p[p]$ to be synonyms for the same variable. In the algorithm
we use the notation $\mathit{setcol}_p[A] := B$ to denote the parallel assignment of a set of colors
$B$ based on a set of indices $A$. To make this assignment deterministic, we suppose that $A$ can be
represented by a sorted list for purposes of the assignment; $B$ is similarly structured as a list.

$\mathit{dom}_p$ for leader $p$ is computed to be the nodes to which $p$ can give a preferred color;
these include any $q \in N_p$ such that $\mathit{min}\ell_q = p$. We say for $q \in \mathit{dom}_p$
that $p$ dominates $q$.

$f$ is a function used by each leader $p$ to compute a set of unused colors to assign to the nodes in
$\mathit{dom}_p$. The set of used colors for $p$ is

\[ \{\, c \mid (\exists\, q, r : q \in N_p \wedge (c, r) \in \mathit{spectrum}_q \wedge r < p) \,\} \]

That is, used colors with respect to $p$ are those colors in $N_p^3$ that are already assigned by
leaders with smaller identifiers than $p$. The complement of the used set is the range of possible
colors that $p$ may prefer for nodes it dominates. Let $f$ be the function to minimize the number
of colors preferred for the nodes of $\mathit{dom}_p$, ensuring that the colors for $\mathit{dom}_p$
are distinct, and assigning smaller color indices (as close to 0 as possible) preferentially.
Function $f$ returns a list of colors to match the deterministic list of $\mathit{dom}_p$ in the
assignment of R5.
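
One way to realize the description of $f$ is a first-fit choice of the smallest free colors. The
sketch below is an assumption about how such a function might look: the text gives $f$ a single
argument (the used set), whereas this version also takes the number of colors wanted, standing in
for $|\mathit{dom}_p|$, and it assumes colors are the integers 0, 1, 2, and so on.

```python
def f(used, count):
    """Return the `count` smallest non-negative colors not in `used`, in increasing order.

    `count` is a hypothetical extra parameter standing in for the size of dom_p.
    """
    colors = []
    c = 0
    while len(colors) < count:
        if c not in used:
            colors.append(c)
        c += 1
    return colors

# A leader with three dominated nodes, where colors 0 and 2 are already claimed
# by leaders with smaller identifiers (made-up values).
print(f({0, 2}, 3))
```

Pairing the returned list with the sorted list of $\mathit{dom}_p$ gives the deterministic parallel
assignment described for $\mathit{setcol}_p$.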

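R4: $\ell_p \;\rightarrow\; \mathit{dom}_p := \{p\} \cup \{\, q \mid q \in N_p \wedge \boxtimes\mathit{min}\ell_q = p \,\}$

R5: $\ell_p \;\rightarrow\; \mathit{setcol}_p[\mathit{dom}_p] := f(\{\, c \mid \exists\, q, r : q \in N_p \wedge r < p \wedge (c, r) \in \boxtimes\mathit{spectrum}_q \,\})$

R6: $\mathit{true} \;\rightarrow\; \mathit{min}\ell_p := \min \{\, q \mid q \in N_p \cup \{p\} \wedge \boxtimes\ell_q \,\}$

R7: $\neg\ell_p \;\rightarrow\; \mathit{color}_p := \boxtimes\mathit{setcol}_r[p]$, where $r = \mathit{min}\ell_p$

R8: $\neg\ell_p \;\rightarrow\; \mathit{spectrum}_p := \{(\mathit{color}_p, \mathit{min}\ell_p)\} \cup \{\, (c, r) \mid (\exists q : q \in N_p : c = \boxtimes\mathit{color}_q \wedge r = \boxtimes\mathit{min}\ell_q) \,\}$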
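
The net effect of R4-R8 at a fixed point can be approximated by a centralized, sequential sketch in
which leaders choose colors in order of increasing name. This is not the distributed algorithm:
caches and the spectrum variables are replaced by direct lookups, identifiers are globally unique,
and the used set is taken over all nodes within distance three of the leader. That last choice, the
grid topology, and the helper names are assumptions made for this illustration.

```python
from collections import deque
from itertools import count

def within(adj, p, d):
    """Nodes at distance at most d from p (including p), by breadth-first search."""
    dist = {p: 0}
    queue = deque([p])
    while queue:
        u = queue.popleft()
        if dist[u] < d:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return set(dist)

def leader_coloring(adj, leaders):
    # R6: a leader follows itself; a nonleader follows its smallest leader neighbor.
    minl = {p: p if p in leaders else min(q for q in adj[p] if q in leaders)
            for p in adj}
    color = {}
    for p in sorted(leaders):                    # smaller names choose first
        dom = sorted(q for q in adj if minl[q] == p)                  # R4
        # Colors already fixed nearby; all were set by smaller-named leaders.
        used = {color[q] for q in within(adj, p, 3) if q in color}
        free = (c for c in count() if c not in used)                  # role of f
        for q in dom:                                                 # R5 and R7
            color[q] = next(free)
    return color

# Hypothetical 4 x 4 grid; node i sits at row i // W, column i % W.
W, H = 4, 4
adj = {i: set() for i in range(W * H)}
for i in range(W * H):
    if i % W < W - 1:
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    if i + W < W * H:
        adj[i].add(i + W)
        adj[i + W].add(i)

# Leaders: the maximal independent set that favors smaller names.
leaders = set()
for p in sorted(adj):
    if not adj[p] & leaders:
        leaders.add(p)

color = leader_coloring(adj, leaders)
print(color)
```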

Lemma 4. The algorithm R4-R8 converges to a distance-two coloring, with probability 1; the
convergence time is $O(1)$ if each timed variable propagation completes in $O(1)$ time.

Proof. The proof is a sequence of observations to reflect the essentially sequential character of
color assignment. We consider an execution where the set of leaders has been established by R1-R3
initially. Observe that in $O(1)$ time the assignments of R6 reach a fixed point, based on the local
reference to $\boxtimes\ell_q$ for neighbors. Therefore, in $O(1)$ time, the shared variables
$\mathit{min}\ell_p$ are propagated to $N_p$ and caches $\boxtimes\mathit{min}\ell_p$ are stable.
Similarly, in $O(1)$ additional time, the assignments of R4 reach a fixed point, so that leaders
have stable $\mathit{dom}$ variables.

The remainder of the proof is an induction to show that color assignments stabilize in $O(\Delta)$
phases (recall that $\Delta$ is the constant of Lemma 2). For the base case of the induction, consider
the set $S$ of leader nodes such that for every $p \in S$, within $N_p^3$ no leader of smaller name
than $p$ occurs. We use distance three rather than distance two so that such a leader node's choice
of colors is stable, independent of the choices made by other leaders. Set $S$ is non-empty because,
of the set of leaders in the network, at least one has minimal name, which is unique up to distance
three. Call $S$ the set of root leaders. Given such a leader node $p$, each neighbor $q \in N_p$
executes R8 within $O(1)$ time and assigns to $\mathit{spectrum}_q$ a set of tuples with the property
that for any $(c, r) \in \mathit{spectrum}_q$, $r \ge p$. Notice that although $\mathit{spectrum}_q$
could subsequently change in the course of the execution, this property is stable. Therefore, in
$O(1)$ additional time, no tuple of $\boxtimes\mathit{spectrum}_q$ has a smaller value than $p$ in
its second component. It follows that any subsequent evaluation of R5 by leader $p$ has a fixed
point: $p$ assigns colors to all nodes of $N_p$. After $O(1)$ delay, for $q \in N_p$,
$\boxtimes\mathit{setcol}_p$ stabilizes. Then in $O(1)$ time, all nodes of $\mathit{dom}_p$ assign
their color variables using R7. This completes the base case, assignment of colors by root leaders.

We complete the induction by examining nodes with minimum distance $k > 0$ from any root
leader along a path of increasing leader names (referring to the $\mathit{Up}$ predicate used in
Lemma 2). The hypothesis for the induction is that nodes up to distance $k - 1$ along an increasing
path of leader names have stabilized to a permanent assignment of colors to the nodes they dominate.
Arguments similar to the base case show that such nodes at distance $k$ eliminate colors already
claimed by leaders of the hypothesis set in their evaluations of R5. The entire inductive step
(extending by one all paths of increasing names from the root leaders) consumes $O(1)$ additional
time. The induction terminates at $\Delta$, thanks to Lemma 2; hence the overall bound of
$O(\Delta)$ holds for convergence.

Only at one point in the proof do we mention distance-three information, which is to establish the
base case for root leaders (implicitly it is also used in the inductive step). Had we only used
neighborhood naming unique up to distance two, it would not be ensured that a clear ordering of
colors exists between leaders that compete for dominated nodes, e.g., a leader $p$ could find that
some node $r \in N_p$ has been assigned a color by another leader $q$, but the names of $p$ and $q$
are equal; this conflict would permit $q$ to assign the same color to $r$ that $p$ assigns to some
neighbor of $r$. We use distance-three unique naming to simplify the presentation, rather than
presenting a more complicated technique to break ties. Another useful intuition for an improved
algorithm is that the result of Lemma 2 is possibly stronger than necessary: if paths of increasing
names have at most some constant length $d$ with high probability, and the algorithms for leader
selection and color assignment tolerate rare cases of naming conflicts, the expected convergence
time would remain $O(1)$ in the construction.

Due to space restrictions, we omit the proof that the resulting coloring is minimal (which follows
from the construction of $f$ to be locally minimum, and the essentially sequential assignment of
colors along paths of increasing names).



6 Assigning Time Slots from Colors 



Given a distance-two coloring of the network nodes, the next task is to derive time slot assignments 
for each node for TDMA scheduling. Our starting assumption is that each node has equal priority 
for assigning time slots, i.e., we are using an unweighted model in allocating bandwidth. Before
presenting an algorithm, we have two motivating observations. 

First, the algorithms that provide coloring are local in the sense that the actual number of colors
assigned is not available in any global variable. Therefore to assign time slots consistently to all
nodes apparently requires some additional computation. In the first solution of Figure 1, both
leftmost and rightmost nodes have color 1; however, only at the leftmost node is it clear that color
1 should be allocated one ninth of the time slots. Local information available at the rightmost
node might imply that color 1 should have one third of the allocated slots.

The second observation is that each node $p$ should have about as much bandwidth as any other
node in $N_p^2$. This follows from our assumption that all nodes have equal priority. Consider the
$N_p^2$ sizes shown in Figure 2 that correspond to the colorings of Figure 1. The rightmost node $p$
in the first coloring has three colors in its two-neighborhood, but has a neighbor $q$ with four
colors in its two-neighborhood. It follows that $q$ shares bandwidth with four nodes: $q$'s share of
the bandwidth is at most 1/4, whereas $p$'s share is at most 1/3. It does not violate fairness to
allow $p$ to use 1/3 of the slot allocation if these slots would otherwise be wasted. Our algorithm
therefore allocates slots in order from most constrained (least bandwidth share) to least
constrained, so that extra slots can be used where available.
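
The allocation order described above can be illustrated on a five-node example. The sketch is
hypothetical in its details: the graph, the coloring, and the helper names are made up, and the
two-neighborhood is assumed to include the node itself. Ties between equally constrained nodes are
broken by smaller color.

```python
def two_neighborhood(adj, p):
    """Nodes within distance two of p, including p itself (an assumption here)."""
    near = {p} | set(adj[p])
    for q in adj[p]:
        near |= set(adj[q])
    return near

def base(adj, color, p):
    """Number of distinct colors in the two-neighborhood of p."""
    return len({color[q] for q in two_neighborhood(adj, p)})

# Hub h with neighbors a, b, c, and a node p hanging off a.
adj = {"h": ["a", "b", "c"], "a": ["h", "p"], "b": ["h"], "c": ["h"], "p": ["a"]}
color = {"h": 0, "a": 1, "b": 2, "c": 3, "p": 2}     # a valid distance-two coloring

bases = {p: base(adj, color, p) for p in adj}
# Most constrained first: larger base, then smaller color.
order = sorted(adj, key=lambda p: (-bases[p], color[p]))
print(bases, order)
```

Here `p` sees three colors while its neighbor `a` sees four, so `p` is served last and may take up
to one third of the time if that much remains free.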

To describe the algorithm that allocates media access time for node p, we introduce these shared 
variables and local functions. 

$\mathit{base}_p$ stabilizes to the number of colors in $N_p^2$. The value
$\mathit{base}_p^{-1} = 1/\mathit{base}_p$ is used as a constraint on the share of bandwidth required
by $p$ in the TDMA slot assignment.

$\mathit{itvl}_p$ is a set of intervals of the form $[x, y)$ where $0 \le x < y \le 1$. For
allocation, each unit of time is divided into intervals and $\mathit{itvl}_p$ is the set of intervals
that node $p$ can use to transmit messages. The expression $|[x, y)|$ denotes the time-length of an
interval.

$g(b, S)$ is a function to assign intervals, where $S$ is a set of subintervals of $[0, 1)$. Function
$g(b, S)$ returns a maximal set $T$ of subintervals of $[0, 1)$ that are disjoint and also disjoint
from any element of $S$ such that $(\sum_{a \in T} |a|) \le b$.
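
The definition of $g$ leaves open which maximal set is returned. The sketch below picks one
possibility, a first-fit scan from the left of $[0, 1)$; that choice, and the representation of an
interval as a pair of `Fraction` endpoints, are assumptions for illustration.

```python
from fractions import Fraction

def g(b, S):
    """First-fit sketch of g(b, S).

    Takes free time from the left of [0, 1), skipping the intervals in S,
    until a total length of b has been collected or no free time remains.
    """
    T, budget, cursor = [], Fraction(b), Fraction(0)
    # The zero-length sentinel at 1 closes the last free gap.
    for x, y in sorted(S) + [(Fraction(1), Fraction(1))]:
        if budget > 0 and x > cursor:            # free gap [cursor, x)
            take = min(budget, x - cursor)
            T.append((cursor, cursor + take))
            budget -= take
        cursor = max(cursor, y)
    return T

taken = [(Fraction(0), Fraction(1, 4)), (Fraction(1, 2), Fraction(3, 4))]
print(g(Fraction(1, 3), taken))
```

Exact rational arithmetic keeps interval endpoints comparable across nodes, which a floating-point
version would not guarantee.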

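To simplify the presentation, we introduce $S_p$ as a private (nonshared) variable.

R9: $\mathit{true} \;\rightarrow\; \mathit{base}_p := |\{\, \boxtimes\mathit{color}_q \mid q \in N_p^2 \,\}|$

R10: $\mathit{true} \;\rightarrow\; S_p := \bigcup \{\, \boxtimes\mathit{itvl}_q \mid q \in N_p^2 \wedge (\boxtimes\mathit{base}_q > \mathit{base}_p \vee (\boxtimes\mathit{base}_q = \mathit{base}_p \wedge \boxtimes\mathit{color}_q < \mathit{color}_p)) \,\}$

R11: $\mathit{true} \;\rightarrow\; \mathit{itvl}_p := g(\mathit{base}_p^{-1}, S_p)$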

Lemma 5. With probability 1 the algorithm R9-R11 converges to an allocation of time intervals
such that no two nodes within distance two have conflicting time intervals, and the interval lengths
for each node $p$ sum to $|\{\, \mathit{color}_q \mid q \in N_p^2 \,\}|^{-1}$; the expected
convergence time of R9-R11 is $O(1)$ starting from any state with stable and valid coloring.

Proof. Similar to that for Lemma 4; we omit details.

It can be verified of R9-R11 that, at a fixed point, no node $q \in N_p^2$ is assigned a time that
overlaps with interval(s) assigned to $p$; also, all available time is assigned (there are no
leftover intervals). A remaining practical issue is the conversion from intervals to a time slot
schedule: a discrete TDMA slot schedule will approximate the intervals calculated by $g$.
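
One simple conversion, offered only as a sketch, divides each unit of time into $n$ equal slots and
gives a node exactly those slots that lie entirely inside one of its intervals. The rounding rule,
the frame length of 12, and the function name `slots` are assumptions; the rule is conservative in
that disjoint intervals always yield disjoint slot sets.

```python
from fractions import Fraction

def slots(intervals, n):
    """Indices i of the slots [i/n, (i+1)/n) that fit entirely inside one interval."""
    return [i for i in range(n)
            if any(x <= Fraction(i, n) and Fraction(i + 1, n) <= y
                   for x, y in intervals)]

# Two nodes with disjoint interval sets in a hypothetical frame of 12 slots.
first = [(Fraction(0), Fraction(1, 4))]
second = [(Fraction(1, 4), Fraction(1, 2)), (Fraction(3, 4), Fraction(5, 6))]
print(slots(first, 12), slots(second, 12))
```

A finer frame wastes less of each interval at the cost of a longer schedule.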



7 Assembly 



Given the component algorithms of the preceding sections, the concluding statement of our result follows.

Theorem 1. The composition of N0-N3 and R1-R11 is a probabilistically self-stabilizing solution
to T with $O(1)$ expected local convergence time.

Proof. The infrastructure for neighborhood discovery and shared variable propagation N0-N2
contributes $O(1)$ delay (partly by assumption on the CSMA/CA behavior), and N3 establishes
neighborhood unique naming in expected $O(1)$ time. The subsequent layers R1-R3, R4-R8, and
R9-R11 each have $O(1)$ expected convergence time, and each layer is only dependent on the output
of the previous layer. The hierarchical composition theorem (see [21]) implies overall stabilization,
and the expected convergence time is the sum of the expected convergence times of the components.

8 Conclusion 

Sensor networks differ in characteristics and in typical applications from other large scale networks 
such as the Internet. Sensor networks of extreme scale (hundreds of thousands to millions of nodes)
have been imagined, motivating scalability concerns for such networks. The current generation
of sensor networks emphasizes the sensing aspect of the nodes, so services that aggregate data 
and report data have been emphasized. Future generations of sensor networks will have significant 
actuation capabilities. In the context of large scale sensor/actuator networks, end-to-end services 
can be less important than regional and local services. Therefore we emphasize local stabilization 
time rather than global stabilization time in this paper, as the local stabilization time is likely to be 
more important for scalability of TDMA than global stabilization time. Nonetheless, the question 
of global stabilization time is neglected in previous sections. We speculate that global stabilization 
time will be sublinear in the diameter of the network (which could be a different type of argument 
for scalability of our constructions, considering that end-to-end latency would be linear in the 
network diameter even after stabilization). Some justification for our speculation is the following: 
if the expected local time for convergence is $O(1)$ and underlying probability assumptions are
derived from Bernoulli (random name selection) and Poisson (wireless CSMA/CA) distributions,
then these distributions can be approximately bounded by exponential distributions with constant
means. Exponential distributions define half-lives for populations of convergent processes (given
asymptotically large populations), which is to say that within some constant time $\gamma$, the
expected population of processes that have not converged is halved; it would follow that global
convergence is $O(\lg n)$.
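
The half-life argument can be written out in one line. As a sketch, let $n$ be the number of nodes,
$\gamma$ the constant halving time, and $u(t)$ the expected number of processes that have not
converged by time $t$ (the symbol $u$ is introduced here only for illustration). If $u$ is at least
halved in every period of length $\gamma$, then

\[
  u(t) \;\le\; n \cdot 2^{-\lfloor t/\gamma \rfloor},
  \qquad\text{so}\qquad
  u(t) < 1 \quad\text{whenever}\quad t \ge \gamma\,(\lfloor \lg n \rfloor + 1),
\]

which is the informal sense in which global convergence would take $O(\lg n)$ time under these
assumptions.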

We close by mentioning two important open problems. Because sensor networks can be deployed 
in an ad hoc manner, new sensor nodes can be dynamically thrown into a network, and mobility 
is also possible, the TDMA algorithm we propose could have a serious disadvantage: introduction 
of just one new node could disrupt the TDMA schedules of a sizable part of a network before 
the system stabilizes. Even if the stabilization time is expected to be $O(1)$, it may be that better
algorithms could contain the effects of small topology changes with less impact than our proposed 
construction. One can exploit normal notifications of topology change as suggested in 0] , for exam- 
ple. Another interesting question is whether the assumption of globally synchronized clocks (often 
casually defended by citing GPS availability in literature of wireless networks) is really needed for 
self-stabilizing TDMA construction; we have no proof at present that global synchronization is 
necessary. 